Abstract

The Prüfer code is a bijection between trees on the vertex set $[n]$ and
strings on the set $[n]$ of length $n-2$ (Prüfer strings of order $n$). In this
paper we examine the 'locality' properties of the Prüfer code, i.e. the
effect of changing an element of the Prüfer string on the structure of the
corresponding tree. Our measure for the distance between two trees $T, T^*$
is $\Delta(T,T^*) = n-1-|E(T)\cap E(T^*)|$. We randomly mutate the $\mu$th
element of the Prüfer string of the tree $T$, changing it to the tree $T^*$, and
we asymptotically estimate the probability that this results in a change
of $\ell$ edges, i.e. $P(\Delta=\ell\mid\mu)$. We find that $P(\Delta=\ell\mid\mu)$
is on the order of $n^{-1/3+o(1)}$ for any integer $\ell>1$, and that
$P(\Delta=1\mid\mu) = (1-\mu/n)^2+o(1)$.
This result implies that the probability of a 'perfect' mutation in the
Prüfer code (one for which $\Delta(T,T^*)=1$) is $1/3$.



1 Introduction 

The Prüfer code is a bijection between trees on the vertex set $[n] := \{1,\dots,n\}$
and strings on the set $[n]$ of length $n-2$ (which we will refer to as P-strings).
If we are given a tree $T$, we encode $T$ as a P-string as follows: at step $i$
($1\le i\le n-2$) of the encoding process the lowest-numbered leaf is removed, and its
neighbor is recorded as $p_i$, the $i$th element of the P-string

\[ P = (p_1,\dots,p_{n-2}), \qquad p_i\in[n], \quad (1\le i\le n-2). \]

We will describe a decoding algorithm in a moment. 
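
The encoding rule just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `prufer_encode` and the edge-list input format are assumptions made here, not notation from the paper, and the quadratic-time leaf search is chosen for readability rather than speed.

```python
def prufer_encode(n, edges):
    """Return the P-string (p_1, ..., p_{n-2}) of a tree on vertices 1..n.

    `edges` is an iterable of (u, v) pairs; this input format is an
    assumption of the sketch, not something fixed by the paper.
    """
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    code = []
    for _ in range(n - 2):
        # Step i: remove the lowest-numbered leaf and record its neighbor.
        leaf = min(v for v in adj if len(adj[v]) == 1)
        (neighbor,) = adj[leaf]
        code.append(neighbor)
        adj[neighbor].discard(leaf)
        del adj[leaf]
    return tuple(code)


# Example tree on 7 vertices (chosen for illustration).
example_edges = [(6, 7), (2, 7), (5, 2), (3, 2), (4, 3), (1, 4)]
print(prufer_encode(7, example_edges))  # (4, 3, 2, 2, 7)
```

In the example the leaves are removed in the order 1, 4, 3, 5, 2, and their neighbors 4, 3, 2, 2, 7 form the P-string.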

First we observe that the Prüfer code is one of many methods of representing
trees as numeric strings, [3], [6], [7]. A representation with the property that
small changes in the representation lead to small changes in the represented
object is said to have high locality, a desirable property when the representation
is used in a genetic algorithm [5], [6]. The distance between two numeric string
tree representations is the number of elements in the string which differ, and
the distance between two trees $T, T^*$ is measured by the number of edges in one
tree which are not in the other:

\[ \Delta = \Delta^{(n)} = \Delta^{(n)}(T,T^*) := n-1-|E(T)\cap E(T^*)|, \]

where $E(T)$ is the edge set of tree $T$.
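
As a small illustration of this distance, the sketch below computes $\Delta$ for two trees given as edge lists. The function name `tree_distance` and the edge-list representation are hypothetical choices made for this example only.

```python
def tree_distance(n, edges_a, edges_b):
    """Delta(T, T*) = n - 1 - |E(T) & E(T*)| for two trees on vertices 1..n."""
    # Store edges as unordered pairs so (u, v) and (v, u) compare equal.
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    return n - 1 - len(ea & eb)


path = [(1, 2), (2, 3), (3, 4)]   # the path 1-2-3-4
star = [(2, 1), (1, 3), (1, 4)]   # the star centered at 1
print(tree_distance(4, path, star))  # 2
```

The path and the star share only the edge {1, 2}, so two of the three edges of either tree are missing from the other.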

By a mutation in the P-string we mean the change of exactly one element of
the P-string. Thus we denote the set of all ordered pairs of P-strings differing
in exactly one coordinate (the mutation space) by $\mathcal{M}$, and by $\mathcal{M}_\mu$
we mean the subset of the mutation space in which the P-strings differ in the
$\mu$th coordinate:

\[ \mathcal{M} = \bigcup_{\mu=1}^{n-2}\mathcal{M}_\mu, \qquad
   \mathcal{M}_\mu := \{(P,P^*) : p_i = p_i^* \text{ for } i\ne\mu, \text{ and } p_\mu\ne p_\mu^*\}, \]
where

\[ P = (p_1,\dots,p_{n-2}), \qquad P^* = (p_1^*,\dots,p_{n-2}^*), \]

so $|\mathcal{M}| = n^{n-2}(n-2)(n-1)$, and $|\mathcal{M}_\mu| = n^{n-2}(n-1)$. We choose a pair
$(P,P^*)\in\mathcal{M}$ uniformly at random, and the random variable $\Delta$ measures the
distance between the trees corresponding to $(P,P^*)$. Using $P(\{\text{event}\}\mid\circ)$ to
denote conditional probability, we have

\[ P(\Delta=\ell) = \sum_{\mu=1}^{n-2} P\bigl(\Delta=\ell \mid (P,P^*)\in\mathcal{M}_\mu\bigr)\,
   P\bigl((P,P^*)\in\mathcal{M}_\mu\bigr)
   = \sum_{\mu=1}^{n-2} P\bigl(\Delta=\ell \mid (P,P^*)\in\mathcal{M}_\mu\bigr)\,\frac{1}{n-2}. \]

Hereafter we will represent the event $(P,P^*)\in\mathcal{M}_\mu$ by $\mu$, as in

\[ P(\{\text{event}\}\mid\mu) := P\bigl(\{\text{event}\}\mid (P,P^*)\in\mathcal{M}_\mu\bigr). \]

Computer-assisted experiments conducted by Thompson (see [7], pages 195-196)
for trees with as many as $n = 100$ vertices led him to conjecture
that

\[ \lim_{n\to\infty} P\bigl(\Delta^{(n)} = 1\bigr) = \frac{1}{3}, \tag{1.1} \]

and that if $\mu/n\to\alpha$, then

\[ \lim_{n\to\infty} P\bigl(\Delta^{(n)} = 1 \mid \mu\bigr) = (1-\alpha)^2. \tag{1.2} \]

In a recent paper [5], Paulden and Smith use combinatorial and numerical methods
to develop conjectures about the exact value of $P(\Delta=\ell\mid\mu)$ for $\ell=1,2$,
and about the generic form that $P(\Delta=\ell\mid\mu)$ would take for $\ell>2$. These
conjectures, if true, would prove (1.1)-(1.2). Unfortunately, the formulas representing
the exact value of $P(\Delta=\ell\mid\mu)$ are complicated, even for $\ell=1,2$, and
the proof of their correctness may be difficult. In this paper we will show by a
probabilistic method that (1.1)-(1.2) are indeed correct, proving that

\[ P\bigl(\Delta^{(n)}=1\mid\mu\bigr) = (1-\mu/n)^2 + O\bigl(n^{-1/3}\ln^2 n\bigr), \tag{1.3} \]

and showing in the process that

\[ P\bigl(\Delta^{(n)}=\ell\mid\mu\bigr) = O\bigl(n^{-1/3}\ln^2 n\bigr), \quad (\ell>1). \tag{1.4} \]

Of course (1.3) implies (1.1) because $\int_0^1 (1-\alpha)^2\,d\alpha = 1/3$. In order to prove
these results we will need to analyze the following P-string decoding algorithm,
which we learned of from [1], [5].
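
The passage from (1.3) to (1.1) can be spelled out as follows. This is a sketch of the routine Riemann-sum argument, added for convenience; it assumes that the error term in (1.3) is uniform in $\mu$ and uses the fact that $\mu$ is uniform on $\{1,\dots,n-2\}$.

```latex
\begin{align*}
P\bigl(\Delta^{(n)} = 1\bigr)
  &= \frac{1}{n-2}\sum_{\mu=1}^{n-2} P\bigl(\Delta^{(n)} = 1 \mid \mu\bigr) \\
  &= \frac{1}{n-2}\sum_{\mu=1}^{n-2}\Bigl(1-\frac{\mu}{n}\Bigr)^{2}
     + O\bigl(n^{-1/3}\ln^{2} n\bigr) \\
  &\xrightarrow[n\to\infty]{} \int_{0}^{1}(1-\alpha)^{2}\,d\alpha = \frac{1}{3}.
\end{align*}
```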

1.1 A Decoding Algorithm 

In the decoding algorithm, the P-string $P = (p_1,\dots,p_{n-2})$ is read from rear
to front, so we begin the algorithm at step $n-2$ and count down to step 0.
We begin a generic step $i$ with a tree $T_{i+1}$ which is a subgraph of the tree $T$
which was encoded as $P$. This tree has vertex set $V_{i+1}$ of cardinality $n-i-1$
and edge set $E_{i+1}$ of cardinality $n-i-2$. We will add to $T_{i+1}$ a vertex from
$X_{i+1} := [n]\setminus V_{i+1}$, and an edge, and the resulting tree $T_i$ will contain
$T_{i+1}$ as a subgraph. The vertex added at step $i$ of the decoding algorithm is the vertex
which was removed at step $i+1$ of the encoding algorithm, and will be denoted
by $y_i$. A formal description of the decoding algorithm is given below.

Decoding Algorithm 



Input: $P = (p_1,\dots,p_{n-2})$ and $X_{n-1} := [n-1]$, $V_{n-1} = \{n\}$,
$E_{n-1} = \emptyset$, $p_{n-1} := n$.

Step $i$ ($1\le i\le n-2$): We begin with the set $X_{i+1}$ and a tree $T_{i+1}$ having
vertex set $V_{i+1}$ and edge set $E_{i+1}$. We examine entry $p_i$ of $P$.

1. If $p_i\in X_{i+1}$, then set $y_i = p_i$.

2. If $p_i\notin X_{i+1}$, then let $y_i = \max X_{i+1}$ (the largest element of $X_{i+1}$).

In either case we add $y_i$ to the tree $T_{i+1}$, joining it by an edge to the vertex $p_{i+1}$
(which must already be a vertex of $T_{i+1}$). So $X_i = X_{i+1}\setminus\{y_i\}$,
$V_i = V_{i+1}\cup\{y_i\}$, and $E_i = E_{i+1}\cup\{\{y_i,p_{i+1}\}\}$.

Step 0: We add $y_0$, the only vertex in $X_1$, and the edge $\{y_0,p_1\}$ to the tree
$T_1$ to form the tree $T_0 = T$.
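
The decoding algorithm translates almost line by line into code. The sketch below is one possible rendering in Python; the function name `prufer_decode`, the dictionary `entry` used for 1-based indexing, and the choice to return edges as ordered pairs $(y_i, p_{i+1})$ are all assumptions of this illustration.

```python
def prufer_decode(p):
    """Decode P = (p_1, ..., p_{n-2}), reading it from rear to front.

    Returns the edges in the order they are added, each as (y_i, p_{i+1}).
    """
    n = len(p) + 2
    entry = {i + 1: v for i, v in enumerate(p)}   # entry[i] = p_i (1-based)
    entry[n - 1] = n                               # convention p_{n-1} := n
    unplaced = set(range(1, n))                    # X_{n-1} = [n-1]
    edges = []
    for i in range(n - 2, 0, -1):                  # steps n-2, ..., 1
        # Rule 1: use p_i if it is still unplaced; rule 2: otherwise max X_{i+1}.
        y = entry[i] if entry[i] in unplaced else max(unplaced)
        edges.append((y, entry[i + 1]))            # join y_i to p_{i+1}
        unplaced.remove(y)
    (y0,) = unplaced                               # step 0: the only vertex left
    edges.append((y0, entry[1]))
    return edges


print(prufer_decode((4, 3, 2, 2, 7)))
# [(6, 7), (2, 7), (5, 2), (3, 2), (4, 3), (1, 4)]
```

Because the loop only looks at `entry[i]` and `entry[i + 1]` at step $i$, the entries $p_1,\dots,p_{i-1}$ play no role until later steps.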
In this algorithm, we do not need to know the values of $p_1,\dots,p_i$ until after
step $i+1$. We will take advantage of this by using the principle of deferred
decisions. With $\mu$ fixed, we will begin with $p_{\mu+1},\dots,p_{n-2}$ determined, but
with $p_1,\dots,p_\mu$ as yet undetermined. We will then choose the values of the $p_i$
for $1\le i\le\mu$ when the algorithm requires those values and no sooner.

This will mean that the composition of the sets $X_i, V_i, E_i$ will only be determined
once we have conditioned on $p_i,\dots,p_{n-2}$. When we compute the probability
that $p_{i-1}$ is in a set $A_i$ whose elements are determined by $p_j$, $j\ge i$ (for
example $X_i$ or $V_i$), we are implicitly using the law of total probability:

\[ P(p_{i-1}\in A_i\mid\mu) = \sum_{P_i} P(p_{i-1}\in A_i\mid P_i\,;\mu)\,P(P_i\mid\mu), \]

where the sum above is over all P-sub-strings $P_i = (p_i,\dots,p_{n-2})$ of the appropriate
length, and $P(P_i\mid\mu)$ is the probability of entries $i$ through $n-2$ of
the P-string taking the values $(p_i,\dots,p_{n-2})$. We will leave such conditioning as
implicit when estimating probabilities of the type $P(p_{i-1}\in A_i\mid\mu)$.

In the next section, we will use the principle of deferred decisions to easily
find a lower bound for $P(\Delta=1\mid\mu)$, and in later sections we will use similar
techniques to establish asymptotically sharp upper bounds for $P(\Delta=1\mid\mu)$, as
well as for $P(\Delta=\ell\mid\mu)$ ($\ell>1$). The combination of these bounds will prove
(1.3)-(1.4).

2 Lower Bounds 

For a fixed value of $\mu$, we will construct a pair of strings from $\mathcal{M}_\mu$, starting our
construction with two partial strings

\[ P_{\mu+1} = (p_{\mu+1},\dots,p_{n-2}), \qquad P^*_{\mu+1} = (p^*_{\mu+1},\dots,p^*_{n-2}), \qquad p_j = p_j^*, \]

where $p_j$ has been selected uniformly at random from $[n]$ for $\mu+1\le j\le n-2$. We
have not yet chosen $p_j, p_j^*$ for $j\le\mu$. We run the decoding algorithm from step
$n-2$ down through step $\mu+1$, and at this point we have two trees $T_{\mu+1} = T^*_{\mu+1}$
into which $P_{\mu+1} = P^*_{\mu+1}$ have been partially decoded. Of course we also have the
sets $V_{\mu+1} = V^*_{\mu+1}$ and $X_{\mu+1} = X^*_{\mu+1}$, where

\[ V_i := \{j : j \text{ is a vertex of } T_i\}, \qquad V_i^* := \{j : j \text{ is a vertex of } T_i^*\}, \]

and $X_i = [n]\setminus V_i$, $X_i^* = [n]\setminus V_i^*$. We let $E_i, E_i^*$ represent the edge
sets of $T_i, T_i^*$.

Now we choose $p_\mu$ and $p_\mu^*\ne p_\mu$, and execute step $\mu$ of the decoding algorithm.
There are two possibilities:

1. If both $p_\mu, p_\mu^*\in V_{\mu+1}\cup\{\max X_{\mu+1}\}$, then $y_\mu = y_\mu^* = \max X_{\mu+1}$.
We have added the same vertex and the same edge ($y_\mu$ and $\{y_\mu,p_{\mu+1}\}$) to both
$T_{\mu+1}$ and $T^*_{\mu+1}$. We have $V_\mu = V_\mu^*$ and $E_\mu = E_\mu^*$.

2. One of $p_\mu, p_\mu^*$ is not an element of the set $V_{\mu+1}\cup\{\max X_{\mu+1}\}$.

We will denote the first of these two events by

\[ \mathcal{E} := \bigl\{\text{both } p_\mu, p_\mu^*\in V_{\mu+1}\cup\{\max X_{\mu+1}\}\bigr\}, \tag{2.1} \]

and we will show that on this event, $\Delta = 1$ no matter what values of $p_j = p_j^*$
($1\le j\le\mu-1$) we choose to complete the strings $P, P^*$. Thus

\[ \mathcal{E}\subseteq\{\Delta=1\} \implies P(\mathcal{E}\mid\mu)\le P(\Delta=1\mid\mu). \]

Let us now prove the set containment shown in the previous line.
Proof. Suppose that event $\mathcal{E}$ occurs, so that $V_\mu = V_\mu^*$ and $X_\mu = X_\mu^*$ and
$T_\mu = T_\mu^*$. Now choose $p_1,\dots,p_{\mu-1}$ uniformly at random from $[n]$, with
$p_i^* = p_i$ for $1\le i\le\mu-1$.

At steps $\mu-1, \mu-2, \dots, 0$ of the algorithm, we will, at every step, read the
same entry $p_i = p_i^*$ from the strings $P, P^*$. Because $X_\mu = X_\mu^*$ and
$p_{\mu-1} = p^*_{\mu-1}$, the algorithm demands that we add to $T_\mu, T_\mu^*$ the same vertex
$y_{\mu-1} = y^*_{\mu-1}$. This in turn means that $X_{\mu-1} = X^*_{\mu-1}$. In a similar fashion,
for $0\le i\le\mu-2$ we have

\[ X_{i+1} = X^*_{i+1} \implies y_i = y_i^*. \]

Thus at every step $i<\mu$ of the algorithm we add the same vertex to $V_{i+1}, V^*_{i+1}$.
Furthermore, at every step we are adding the edge $\{y_i,p_{i+1}\}$ to $E_{i+1}$ and the
edge $\{y_i,p^*_{i+1}\}$ to $E^*_{i+1}$. Since $p_i = p_i^*$ for $i\ne\mu$ and $p_\mu\ne p_\mu^*$,
we add the same edge to $T_{i+1}$ and $T^*_{i+1}$ at every step except at step $\mu-1$, at which
we add $\{y_{\mu-1},p_\mu\}$ to $T_\mu$ and $\{y_{\mu-1},p^*_\mu\}$ to $T^*_\mu$. Of course the same
edge cannot be added to a tree twice, so at no point could we have added $\{y_{\mu-1},p^*_\mu\}$
to $T$ or $\{y_{\mu-1},p_\mu\}$ to $T^*$. Thus $T$ and $T^*$ must have exactly $n-2$ edges in
common, and

\[ \Delta = \Delta^{(n)}(T,T^*) := n-1-|E(T)\cap E(T^*)| = 1. \]

□

Note: We have proved that if $X_k = X_k^*$ for $k\le\mu$ then $X_j = X_j^*$ for all
$j\le k$, that the same vertex is added at every step $j<k$, and that the same
edge is added at every step $j<\min\{k,\mu-2\}$. We will need this result later.



Now we bound the conditional probability of event $\mathcal{E}$. Since $p_\mu$ is uniform on $[n]$,
$p_\mu^*$ is uniform on $[n]\setminus\{p_\mu\}$, and $|V_{\mu+1}\cup\{\max X_{\mu+1}\}| = n-\mu$,

\[ P(\Delta=1\mid\mu) \ge P(\mathcal{E}\mid\mu) = \frac{n-\mu}{n}\cdot\frac{n-\mu-1}{n-1}
   = \Bigl(1-\frac{\mu}{n}\Bigr)^{2} + O(n^{-1}). \]

Thus we have

\[ P(\Delta=1\mid\mu) \ge (1-\mu/n)^2 + O(n^{-1}). \]

Of course $P(\{\Delta=\ell\}\cap\mathcal{E}\mid\mu) = 0$ for $\ell>1$, so in order to prove
(1.3)-(1.4) it remains to show that

\[ P(\{\Delta=\ell\}\cap\mathcal{E}^c\mid\mu) = O\bigl(n^{-1/3}\ln^2 n\bigr), \quad (\ell\ge1). \tag{2.2} \]

This endeavor will prove more complicated than the lower bound, so we will
need to establish some preliminary results and make some observations which
will prove useful later.
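
For very small $n$ the lower bound of this section can be checked by brute force over the whole mutation space. The following Python sketch does this with exact rational arithmetic; the helper names (`decode_edges`, `delta_distribution`, `event_probability`) are hypothetical, and the decoder is a compact version of the rear-to-front algorithm of Section 1.1. It is meant as a sanity check, not as a replacement for the proof.

```python
from fractions import Fraction
from itertools import product


def decode_edges(p):
    """Rear-to-front decoding; returns the edge set of the decoded tree."""
    n = len(p) + 2
    entry = {i + 1: v for i, v in enumerate(p)}   # entry[i] = p_i (1-based)
    entry[n - 1] = n                               # convention p_{n-1} := n
    unplaced = set(range(1, n))                    # X_{n-1} = [n-1]
    edges = set()
    for i in range(n - 2, -1, -1):                 # steps n-2, ..., 0
        if i >= 1 and entry[i] in unplaced:
            y = entry[i]
        else:
            y = max(unplaced)                      # also covers step 0
        edges.add(frozenset((y, entry[i + 1])))
        unplaced.remove(y)
    return frozenset(edges)


def delta_distribution(n, mu):
    """Exact law of Delta given that coordinate mu is the mutated one."""
    counts, total = {}, 0
    for p in product(range(1, n + 1), repeat=n - 2):
        tree = decode_edges(p)
        for new in range(1, n + 1):
            if new == p[mu - 1]:
                continue                           # a mutation must change p_mu
            q = p[:mu - 1] + (new,) + p[mu:]
            d = n - 1 - len(tree & decode_edges(q))
            counts[d] = counts.get(d, 0) + 1
            total += 1
    return {d: Fraction(c, total) for d, c in counts.items()}


def event_probability(n, mu):
    """P(E | mu) = (n - mu)/n * (n - mu - 1)/(n - 1)."""
    return Fraction((n - mu) * (n - mu - 1), n * (n - 1))


n = 5
for mu in range(1, n - 1):
    dist = delta_distribution(n, mu)
    # Columns: mu, exact P(Delta = 1 | mu), lower bound P(E | mu).
    print(mu, dist.get(1, Fraction(0)), event_probability(n, mu))
```

The enumeration visits $n^{n-2}(n-1)$ pairs per value of $\mu$, so it is only practical for single-digit $n$.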

3 Observations and Preliminary Results 

Recall that after step $j$ of the decoding algorithm we have two sets $X_j, X_j^*$ of
vertices which have not been placed in $T_j, T_j^*$. For $j\ge\mu+1$, we know that
$X_j = X_j^*$, but we may have $X_j\ne X_j^*$ for $j\le\mu$. So let us consider then the set

\[ \mathcal{X}_j := X_j\cup X_j^*. \]

Our goal is to show that either $X_j = X_j^*$, or $\mathcal{X}_j$ consists of $X_j\cap X_j^*$ and of
two additional vertices, one in $V_j\setminus V_j^*$ and one in $V_j^*\setminus V_j$. This means
$\mathcal{X}_j$ has the following form:

\[ \mathcal{X}_j := \{x_1 < \dots < x_a < \min\{z_j,z_j^*\} < x_{a+1} < \dots < x_{a+b} <
   \max\{z_j,z_j^*\} < x_{a+b+1} < \dots < x_{a+b+c}\}, \tag{3.1} \]

where

\[ z_j\in V_j\setminus V_j^*, \qquad z_j^*\in V_j^*\setminus V_j, \qquad x_i\in X_j\cap X_j^*, \quad (1\le i\le a+b+c), \]

and $a, b, c\ge0$, with $a+b+c = j-1$. We will consider a set $\mathcal{X}_j = X_j = X_j^*$ to also
have the form shown above, but with $\{z_j,z_j^*\} = \emptyset$ and $b(j) = c(j) = 0$, $a(j) = j$.
Thus when showing that $\mathcal{X}_j$ is of the form (3.1), our concern is to show that 1)
there is at most one vertex $z_j\in V_j\setminus V_j^*$, and 2) that there can be such a vertex
if and only if there is exactly one vertex $z_j^*\in V_j^*\setminus V_j$, so $|\{z_j,z_j^*\}|$ is 0 or 2.

For $j\ge\mu+1$, the set $\mathcal{X}_j = X_j = X_j^*$, and it is easy to see that $\mathcal{X}_\mu$ is of the
form (3.1). Also, we showed in the previous section that if $X_k = X_k^*$ for $k\le\mu$
then $X_j = X_j^*$ for all $j\le k$. Thus it is enough to show that if $\mathcal{X}_j$ ($j\le\mu$) is
of the form (3.1) with $\{z_j,z_j^*\}\ne\emptyset$, then $\mathcal{X}_{j-1}$ is also of the form (3.1). This
will be shown in the process of examining what happens to a set $\mathcal{X}_j$ of the form
(3.1) (with $\{z_j,z_j^*\}\ne\emptyset$) at step $j-1$ of the decoding algorithm, an examination
which will take most of this section. In this examination we present notation
and develop results upon which our later probabilistic analysis will depend. We
begin by considering the parameters $a, b, c$.

Of course,

\[ a = a(j), \qquad b = b(j), \qquad c = c(j), \]

depend on $j$ (and on $p_\mu^*$ and $p_i$, $i\ge j$), but we will use the letters $a, b, c$ when
$j$ is clear. We let

\[ A_j := \{x_1 < \dots < x_a\}, \qquad B_j := \{x_{a+1} < \dots < x_{a+b}\}, \]

and

\[ C_j := \{x_{a+b+1} < \dots < x_{a+b+c}\}, \]
so $\mathcal{X}_j = A_j\cup B_j\cup C_j\cup\{z_j,z_j^*\}$.

Ultimately, we are interested not just in the set $\mathcal{X}_j$, but in the distance
between two trees, i.e. $\Delta$. We will find it useful to examine how this distance
changes with each step of the decoding algorithm, so we define

\[ \Delta_j = \Delta_j^{(n)}(T_j,T_j^*,T_{j+1},T^*_{j+1}) := 1 - |E_j\cap E_j^*| + |E_{j+1}\cap E^*_{j+1}|,
   \quad (0\le j\le n-2), \]

and observe that

\[ \Delta^{(n)} = n-1-|E_0\cap E_0^*| + |E_{n-1}\cap E^*_{n-1}| = \Delta_0 + \dots + \Delta_{n-2} \tag{3.2} \]

(recall that $T_{n-1}$ is the single vertex $n$ and $T = T_0$). We add exactly one edge
to each tree at each step of the algorithm, so the function $\Delta_j$ has a range in
the set $\{-1,0,1\}$. It is easy to check that $\Delta_\mu = 1$ as long as
$\min\{p_\mu,p_\mu^*\}\notin V_{\mu+1}\cup\{\max X_{\mu+1}\}$ (so on $\mathcal{E}^c$), and that
$\Delta_{\mu-1}\ge0$ (because $p_\mu\ne p_\mu^*$). Further, if $X_j = X_j^*$ and $j<\mu$, then we
will add the same edge at every step $i<j$, so $\Delta_i = 0$ for all $i<j$.

Finally, we will need some notation to keep track of what neighbor a given
vertex had when it was first added to the tree. Thus for $v\in\{1,\dots,n-1\}$ we
denote by $h(v)$ the neighbor of $v$ in $T_j$, where $j$ is the highest number such that
$v$ is a vertex of $T_j$. Formally,

\[ \text{for } v = y_j, \qquad h(v) = h_P(v) := p_{j+1}, \qquad (P = (p_1,\dots,p_{n-2})). \tag{3.3} \]

For example, if our string is $(4,3,2,2,7)$, then

\[ h(1) = 4, \quad h(2) = 7, \quad h(3) = 2, \quad h(4) = 3, \quad h(5) = 2, \quad h(6) = 7. \]
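
The values of $h$ in this example can be reproduced by recording, at each decoding step, the vertex $p_{j+1}$ to which the new vertex $y_j$ is attached. The Python sketch below does exactly that; the function name `first_neighbors` is a hypothetical choice for this illustration.

```python
def first_neighbors(p):
    """Return {v: h(v)} for v = 1..n-1, where h(y_j) = p_{j+1} as in (3.3)."""
    n = len(p) + 2
    entry = {i + 1: v for i, v in enumerate(p)}   # entry[i] = p_i (1-based)
    entry[n - 1] = n                               # convention p_{n-1} := n
    unplaced = set(range(1, n))                    # X_{n-1} = [n-1]
    h = {}
    for j in range(n - 2, -1, -1):                 # steps n-2, ..., 0
        if j >= 1 and entry[j] in unplaced:
            y = entry[j]
        else:
            y = max(unplaced)                      # also covers step 0
        h[y] = entry[j + 1]                        # neighbor of y_j when first added
        unplaced.remove(y)
    return h


print(sorted(first_neighbors((4, 3, 2, 2, 7)).items()))
# [(1, 4), (2, 7), (3, 2), (4, 3), (5, 2), (6, 7)]
```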

Now we are prepared to examine the behavior of the parameters $a, b, c$, and to
make some crucial observations about the behavior of $\Delta_j$. In the process we
will show that if $\mathcal{X}_j$ is of the form (3.1) with $\{z_j,z_j^*\}\ne\emptyset$ then
$\mathcal{X}_{j-1}$ is of the same form (but possibly with $\{z_{j-1},z^*_{j-1}\} = \emptyset$, meaning
$X_{j-1} = X^*_{j-1}$). The observations below apply to all $1\le j\le\mu$, except observations
about the value of $\Delta_{j-1}$, which apply only to $j\le\mu-1$. For $j\ge\mu$ we only need to
remember that $\Delta_\mu = 1$ on $\mathcal{E}^c$ and $\Delta_{\mu-1}\ge0$.

1. If $p_{j-1}\in A_j\cup B_j\cup C_j$, then $y_{j-1} = y^*_{j-1} = p_{j-1}$, while $z_{j-1} = z_j$,
$z^*_{j-1} = z_j^*$, and $\Delta_{j-1} = 0$ because we add the edge $\{p_{j-1},p_j\}$ to both of
$T_j, T_j^*$.

(a) If $p_{j-1}\in A_j$ then $a(j-1) = a(j)-1$, while $b(j-1) = b(j)$ and
$c(j-1) = c(j)$.

(b) If $p_{j-1}\in B_j$ then $b(j-1) = b(j)-1$ while $a(j-1) = a(j)$ and
$c(j-1) = c(j)$.

(c) If $p_{j-1}\in C_j$ then $c(j-1) = c(j)-1$ while $a(j-1) = a(j)$ and
$b(j-1) = b(j)$.

Thus in every case, one of the parameters $a, b, c$ decreases by 1 while the
others remain unchanged.

2. Suppose that $p_{j-1}\in\mathcal{V}_j := V_j\cap V_j^*$. Then

(a) If $b(j) = c(j) = 0$ then $y_{j-1} = z_j^*$ and $y^*_{j-1} = z_j$, so $X_{j-1} = X^*_{j-1}$.
While $\Delta_{j-1}$ could assume any of the values $-1, 0, 1$, we have $\Delta_i = 0$
for all $i<j-1$.

(b) First suppose that $z_j<z_j^*$ and $b(j)>0$, $c(j) = 0$. Then $y^*_{j-1} = x_{a+b}$
and $y_{j-1} = z_j^*$, making $z^*_{j-1} = x_{a+b}$, $z_{j-1} = z_j$. We have
$B_{j-1} = B_j\setminus\{x_{a+b}\}$, so $a(j-1) = a(j)$, $b(j-1) = b(j)-1$, $c(j-1) = 0$.
Further, $\Delta_{j-1} = 0$ if and only if the event

\[ \mathcal{H}^*_{j-1} := \{h_{P^*}(z_j^*) = p_j\} \tag{3.4} \]

occurs, and otherwise $\Delta_{j-1} = 1$.

Similarly, if $z_j>z_j^*$ and $b(j)>0$, $c(j) = 0$, then $y_{j-1} = x_{a+b}$ and
$y^*_{j-1} = z_j$ with $z_{j-1} = x_{a+b}$, $z^*_{j-1} = z_j^*$. The change in the values of
$a, b, c$ are the same as in the case of $z_j<z_j^*$. We also have $\Delta_{j-1} = 0$
if and only if the event

\[ \mathcal{H}_{j-1} := \{h_P(z_j) = p_j^*\} \tag{3.5} \]

occurs, and otherwise $\Delta_{j-1} = 1$. In summary, if $b(j)>0$, $c(j) = 0$
and $p_{j-1}\in\mathcal{V}_j$, then $\Delta_{j-1} = 1$ unless $\mathcal{H}_{j-1}\cup\mathcal{H}^*_{j-1}$ occurs.

(c) If $b(j)\ge0$, $c(j)>0$ and $p_{j-1}\in\mathcal{V}_j$ then $y^*_{j-1} = y_{j-1} = x_{a+b+c}\in C_j$,
$z_{j-1} = z_j$, $z^*_{j-1} = z_j^*$, and we have $a(j-1) = a(j)$, $b(j-1) = b(j)$,
$c(j-1) = c(j)-1$. Since we add the edge $\{x_{a+b+c},p_j\}$ to both
of $T_j, T_j^*$ we have $\Delta_{j-1} = 0$.

3. Suppose that $p_{j-1} = \max\{z_j,z_j^*\}$.

(a) If $b(j) = c(j) = 0$ then the results are the same as in the case 2a.

(b) If $b(j)>0$, $c(j) = 0$ then the results are the same as in the case 2b.

(c) Suppose $b(j)\ge0$, $c(j)>0$. If $z_j<z_j^*$ and $p_{j-1} = z_j^*$ then
$y^*_{j-1} = x_{a+b+c}$ and $y_{j-1} = z_j^*$, making $z^*_{j-1} = x_{a+b+c}$, $z_{j-1} = z_j$.
If $z_j>z_j^*$ and $p_{j-1} = z_j$ then $y_{j-1} = x_{a+b+c}$ and $y^*_{j-1} = z_j$, making
$z_{j-1} = x_{a+b+c}$, $z^*_{j-1} = z_j^*$. In both cases, $a(j-1) = a(j)$, but
$B_{j-1} = B_j\cup C_j\setminus\{x_{a+b+c}\}$, so $c(j-1) = 0$, $b(j-1) = b(j)+c(j)-1$. In this
case we have $\Delta_{j-1}\ge0$.

4. The last remaining possibility is that $p_{j-1} = \min\{z_j,z_j^*\}$.

(a) If $c(j) = 0$ then $y_{j-1} = z_j^*$ and $y^*_{j-1} = z_j$ so $X_{j-1} = X^*_{j-1}$. We have
$\Delta_{j-1}\in\{-1,0,1\}$ and $\Delta_i = 0$ for all $i<j-1$.

(b) If $c(j)>0$ and $z_j<z_j^*$ then $y_{j-1} = x_{a+b+c}$ and $y^*_{j-1} = z_j$, making
$z_{j-1} = x_{a+b+c}$, $z^*_{j-1} = z_j^*$. If $z_j>z_j^*$ then $y^*_{j-1} = x_{a+b+c}$ and
$y_{j-1} = z_j^*$, making $z^*_{j-1} = x_{a+b+c}$, $z_{j-1} = z_j$. In both cases
$a(j-1) = a(j)+b(j)$ because the set $A_{j-1} = A_j\cup B_j$, and
$B_{j-1} = C_j\setminus\{x_{a+b+c}\}$, so $c(j-1) = 0$, $b(j-1) = c(j)-1$. In this case we
have $\Delta_{j-1}\ge0$.

We have shown that if $\mathcal{X}_j$ is of the form shown in (3.1) then $\mathcal{X}_{j-1}$ will be
of the same form. Furthermore, if $\{z_j,z_j^*\}\ne\emptyset$, then $\{z_{j-1},z^*_{j-1}\} = \emptyset$ (i.e.
$X_{j-1} = X^*_{j-1}$) can only occur if $c(j) = 0$, see cases 2a, 3a and 4a. In addition,
we observe that $|\mathcal{V}_j| = n-j$ when $\{z_j,z_j^*\} = \emptyset$, and if $\{z_j,z_j^*\}\ne\emptyset$ then
$|\mathcal{V}_j| = n-j-1$. We have also seen that as $j$ decreases: 1) the parameter
$c(j)$ never gets larger, and 2) the parameter $b(j)$ decreases by 1 if $p_{j-1}\in B_j$
and otherwise can only decrease if $p_{j-1}\in\{z_j,z_j^*\}$. We end our analysis of the
decoding algorithm with one last observation, which is that $\Delta_j = -1$ for at
most one value of $j$, which is clear from an examination of cases 2a, 3a and 4a,
since only in these cases can $\Delta_j = -1$, and in every case $\Delta_i = 0$ for all $i<j$.

In light of the knowledge that $\Delta_j = -1$ at most once, that $\Delta_\mu = 1$ on $\mathcal{E}^c$,
and of (3.2), we now see that (on $\mathcal{E}^c$) if there are $\ell+1$ indices
$j_1,\dots,j_{\ell+1}<\mu$ such that $\Delta_i = 1$ (for all $i\in\{j_1,\dots,j_{\ell+1}\}$), then
$\Delta>\ell$. Thus in order to show that $\Delta(T,T^*)>\ell$ it suffices to find $\ell+1$ such
indices. So we have reduced the 'global' problem of bounding (from below)
$\Delta = \Delta_0+\dots+\Delta_{n-2}$ to the 'local' problem of showing that it is likely (on
$\mathcal{E}^c$) that for at least $\ell+1$ indices $i<\mu$ we have $\Delta_i = 1$. We will begin this
process in the next section.
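
Some of the structural facts of this section are easy to probe numerically for small $n$. The Python sketch below decodes a pair $(P,P^*)$ in parallel, records the sets $X_j, X_j^*$ and the increments $\Delta_j$, and then scans the whole mutation space for $n=5$. The helper names (`decode_trace`, `step_report`, `mutation_pairs`) are hypothetical; the script only reports what it finds and is not a substitute for the case analysis above.

```python
from itertools import product


def decode_trace(p):
    """Return dicts X[j] (unplaced vertices) and E[j] (edge set), j = n-1..0."""
    n = len(p) + 2
    entry = {i + 1: v for i, v in enumerate(p)}   # entry[i] = p_i (1-based)
    entry[n - 1] = n                               # convention p_{n-1} := n
    X = {n - 1: frozenset(range(1, n))}
    E = {n - 1: frozenset()}
    for j in range(n - 2, -1, -1):
        if j >= 1 and entry[j] in X[j + 1]:
            y = entry[j]
        else:
            y = max(X[j + 1])                      # also covers step 0
        X[j] = X[j + 1] - {y}
        E[j] = E[j + 1] | {frozenset((y, entry[j + 1]))}
    return X, E


def step_report(p, q):
    """Compare the decodings of P and P*.

    Returns (largest one-sided difference between X_j and X_j^*,
             dict j -> Delta_j).
    """
    n = len(p) + 2
    X, E = decode_trace(p)
    Xs, Es = decode_trace(q)
    worst, increments = 0, {}
    for j in range(n - 2, -1, -1):
        worst = max(worst, len(X[j] - Xs[j]), len(Xs[j] - X[j]))
        increments[j] = 1 - len(E[j] & Es[j]) + len(E[j + 1] & Es[j + 1])
    return worst, increments


def mutation_pairs(n):
    """Yield every ordered pair (P, P*) differing in exactly one coordinate."""
    for p in product(range(1, n + 1), repeat=n - 2):
        for mu in range(1, n - 1):
            for new in range(1, n + 1):
                if new != p[mu - 1]:
                    yield p, p[:mu - 1] + (new,) + p[mu:]


n = 5
most_negative_steps = 0
for p, q in mutation_pairs(n):
    worst, inc = step_report(p, q)
    negatives = sum(1 for v in inc.values() if v == -1)
    most_negative_steps = max(most_negative_steps, negatives)
print("largest number of steps with Delta_j = -1:", most_negative_steps)
```

The quantity `worst` corresponds to the claim that $V_j\setminus V_j^*$ and $V_j^*\setminus V_j$ each contain at most one vertex, and summing the increments recovers $\Delta$ as in (3.2).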

4 Upper Bounds 

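We now begin the process of showing that for any positive integer $\ell$,

\[ P(\{\Delta=\ell\}\cap\mathcal{E}^c\mid\mu) = O\bigl(n^{-1/3}\ln^2 n\bigr). \tag{4.1} \]

The event $\mathcal{E}$ is the event that $p_\mu, p_\mu^*\in V_{\mu+1}\cup\{\max X_{\mu+1}\}$, which is the
event that $X_\mu = X_\mu^*$ (equivalently $\{z_\mu,z_\mu^*\} = \emptyset$). So on $\mathcal{E}^c$ we have
$\{z_\mu,z_\mu^*\}\ne\emptyset$, and $\mathcal{E}^c$ is the union of the following events:

1. $\mathcal{E}_1 := \{b(\mu)<\delta_n\}\cap\{\{z_\mu,z_\mu^*\}\ne\emptyset\}$, $\delta_n = n^{1/3}$,

2. $\mathcal{E}_2 := \{b(\mu)\ge\delta_n\}$,

so

\[ P(\{\Delta=\ell\}\cap\mathcal{E}^c\mid\mu) \le P(\mathcal{E}_1\mid\mu) + P(\{\Delta=\ell\}\cap\mathcal{E}_2\mid\mu). \]

Let us show now that

\[ P(\mathcal{E}_1\mid\mu) = O(\delta_n/n). \tag{4.2} \]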

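Proof. Consider the sets

\[ X_{\mu+1} = X^*_{\mu+1} = \{x_1 < \dots < x_{\mu+1}\}, \qquad V_{\mu+1} = V^*_{\mu+1} = [n]\setminus X_{\mu+1}. \]

On $\mathcal{E}_1$ either: 1) $\max\{p_\mu,p_\mu^*\}\in V_{\mu+1}$ and $\min\{p_\mu,p_\mu^*\}$ is one of the
$\lfloor\delta_n\rfloor$ largest elements of $X_{\mu+1}$, or 2) $p_\mu\in X_{\mu+1}$ and $p_\mu^*$ is
separated from $p_\mu$ by at most $\lfloor\delta_n\rfloor$ elements of $X_{\mu+1}$. So denote by
$\mathcal{F}$ the event that $\max\{p_\mu,p_\mu^*\}\in V_{\mu+1}$ and $\min\{p_\mu,p_\mu^*\}$ is one of the
$\lfloor\delta_n\rfloor$ largest elements of $X_{\mu+1}$. Then

\[ \mathcal{F}\subseteq\mathcal{U}_1 := \{\text{at least one of } p_\mu, p_\mu^* \text{ is one of the }
   \lfloor\delta_n\rfloor \text{ largest elements of } X_{\mu+1}\}. \]

Because $p_\mu$ is chosen uniformly at random from $[n]$ and $p_\mu^*$ is chosen uniformly
at random from $[n]\setminus\{p_\mu\}$, a union bound gives us

\[ P(\mathcal{F}\mid\mu) \le P(\mathcal{U}_1\mid\mu) \le \frac{\lfloor\delta_n\rfloor}{n} + \frac{\lfloor\delta_n\rfloor}{n-1}
   = O(\delta_n/n). \]

On the event $\mathcal{E}_1\setminus\mathcal{F}$, we must have $p_\mu, p_\mu^*\in X_{\mu+1}$ and there must be at
most $\lfloor\delta_n\rfloor$ elements of $X_{\mu+1}$ separating $p_\mu$ from $p_\mu^*$. Thus we define

\[ \mathcal{U}_2 := \{p_\mu = x_j\in X_{\mu+1}\,;\ p_\mu^*\in\mathcal{Y}_j\}, \]

\[ \mathcal{Y}_j := \{x_{\max\{1,\,j-\lfloor\delta_n\rfloor\}},\dots,x_{\min\{\mu+1,\,j+\lfloor\delta_n\rfloor\}}\}\setminus\{x_j\}
   \subseteq X_{\mu+1}, \qquad |\mathcal{Y}_j|\le2\lfloor\delta_n\rfloor, \]

and observe that $\mathcal{E}_1\setminus\mathcal{F}\subseteq\mathcal{U}_2$. Then we have

\[ P(\mathcal{U}_2\mid\mu) = \sum_{j=1}^{\mu+1} P(p_\mu^*\in\mathcal{Y}_j\mid p_\mu = x_j\,;\mu)\,P(p_\mu = x_j\mid\mu)
   \le \sum_{j=1}^{\mu+1}\frac{2\lfloor\delta_n\rfloor}{n-1}\cdot\frac{1}{n} = O(\delta_n/n). \]

□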

So we have proved (4.2), and from now on, we may assume that $b(\mu) = |B_\mu|$
is at least $\lceil\delta_n\rceil$. Further, $B_\mu\subseteq X_\mu\setminus\{z_\mu^*\}$, and $|X_\mu| = \mu$,
so we must have $\mu\ge\lceil\delta_n\rceil+1$ on the event $\mathcal{E}_2$. So from here on we will
also be restricting our attention to $\mu\ge\lceil\delta_n\rceil+1$.

4.1 The event $\mathcal{E}_2$

In order to deal with $\mathcal{E}_2$, we will begin at step $\mu-1$, with
$p_\mu, p_\mu^*, p_{\mu+1},\dots,p_{n-2}$ already chosen, and we will begin choosing values for a
number of positions $p_j = p_j^*$ ($j<\mu$) of our P-strings. We will find that with high
probability (whp) at some step $\tau = \tau(P,P^*)$ we have $c(\tau) = 0$, but $b(\tau)$ is on the
order of $\delta_n$. So we will have at least $b(\tau)$ values of $p_j$ ($j<\mu$) left to choose, and
it is likely that for at least $\ell+1$ of those choices we will have $p_j\in\mathcal{V}_{j+1}$. From
case 2b of section 3 we know that when this happens there are three possibilities:

1. the event $\mathcal{H}_j := \{h_P(z_{j+1}) = p_{j+1}\}$ occurs,

2. the event $\mathcal{H}_j^* := \{h_{P^*}(z^*_{j+1}) = p_{j+1}\}$ occurs, or

3. $\Delta_j = 1$.

The event $\mathcal{H}_j\cup\mathcal{H}_j^*$ is unlikely to occur often, so (whp) we will have $\Delta_j = 1$ for
at least $\ell+1$ values of $j<\mu$, which means that $\Delta>\ell$ (whp).

To prove this, let us define the random variable

\[ \tau(z) = \tau(z)(P,P^*) = \max\{j : c(j)\le z\}, \quad (\mu\ge\lceil\delta_n\rceil+1), \]

and the events

\[ \mathcal{S} := \{b(\tau(0))\ge2^{-12}\delta_n\}, \qquad \delta = \delta_n := n^{1/3}, \tag{4.3} \]

\[ \mathcal{T}_1 := \{\tau(\delta)-\tau(0)\le2\beta_n\}, \qquad \mathcal{T}_2 := \{\tau(0)\le n-\beta_n\},
   \qquad \beta_n := n^{2/3}\ln^2 n. \]

We observe that for $u\le v$ we have $\tau(u)\le\tau(v)$ because $c(j)$ is a non-decreasing
function of $j$ ($j\le\mu$). Further, we note that if $\tau(z)<\mu$, then $|C_{\tau(z)+1}|\ge z+1$,
and because $C_j\subseteq X_j\setminus\{z_j^*\}$, we have $|X_{\tau(z)+1}|\ge z+2$. Since $|X_j| = j$, it
must be true that $\tau(z)\ge z+1$, and in particular we have $\tau(\delta)\ge\delta_n+1$, $\tau(0)\ge1$.
These bounds also hold if $\tau(\delta), \tau(0) = \mu$. By a similar argument we can see that
if $b(\tau(0))\ge2^{-12}\delta_n$ (as on the event $\mathcal{S}$) then we must have $\tau(0)\ge2^{-12}\delta_n+1$.
Finally, the following set containment holds for any sets $\mathcal{S}, \mathcal{T}_1, \mathcal{T}_2$:

\[ \{\Delta=\ell\}\cap\mathcal{E}_2 \subseteq \mathcal{T}_1^c\cup\mathcal{T}_2^c\cup(\mathcal{S}^c\cap\mathcal{T}_1\cap\mathcal{E}_2)
   \cup(\{\Delta=\ell\}\cap\mathcal{S}\cap\mathcal{T}_2). \tag{4.4} \]

In this section we will show first that

\[ P(\mathcal{T}_1^c\mid\mu) = O(n^{-1}), \tag{4.5} \]

second that

\[ P(\mathcal{T}_2^c\mid\mu) = O(\beta_n/n), \tag{4.6} \]

and finally that

\[ P(\mathcal{S}^c\cap\mathcal{T}_1\cap\mathcal{E}_2\mid\mu) = O(\beta_n/n). \tag{4.7} \]
In section 4.3 we will prove that

\[ P(\{\Delta=\ell\}\cap\mathcal{S}\cap\mathcal{T}_2\mid\mu) = O(\delta_n/n). \tag{4.8} \]

Combining results (4.5)-(4.8) will prove, via (4.4), that

\[ P(\{\Delta=\ell\}\cap\mathcal{E}_2\mid\mu) = O(\beta_n/n) = O\bigl(n^{-1/3}\ln^2 n\bigr). \]

Since we are ultimately interested in the event $\{\Delta=\ell\}\cap\mathcal{S}\cap\mathcal{T}_2$, which
depends on $\tau(0)$, why must we concern ourselves with $\tau(\delta)$ and $\mathcal{T}_1$? To explain
this, we must introduce the event

\[ \mathcal{Z}_i := \{p_j\notin\{z_{j+1},z^*_{j+1}\} \text{ for } i\le j<\mu\}, \quad (1\le i\le\mu), \tag{4.9} \]

\[ \mathcal{Z}_\delta := \{p_j\notin\{z_{j+1},z^*_{j+1}\} \text{ for } \tau(\delta)\le j<\mu\}, \qquad
   \mathcal{Z}_0 := \{p_j\notin\{z_{j+1},z^*_{j+1}\} \text{ for } \tau(0)\le j<\mu\}. \]

For a fixed integer $i\ge1$, we know if the event $\mathcal{Z}_i$ occurred after examining
$p_i,\dots,p_{n-2},p_\mu^*$, while the events $\mathcal{Z}_\delta, \mathcal{Z}_0$ require knowledge of all
$p_1,\dots,p_{n-2},p_\mu^*$. Of course if we condition on $\tau(0)$ or $\tau(\delta)$ then these last two
events require knowledge of only $p_\tau,\dots,p_{n-2},p_\mu^*$, for $\tau = \tau(0), \tau(\delta)$. Also,
if $\tau(\delta) = \mu$ (respectively if $\tau(0) = \mu$) then the event $\mathcal{Z}_\delta$ (respectively
$\mathcal{Z}_0$) trivially occurred.
To see why we must consider $\tau(\delta)$, note that on the event

\[ \{p_j = \min\{z_{j+1},z^*_{j+1}\} \text{ for some } \tau(0)\le j<\tau(\delta)\}\subseteq\mathcal{Z}_0^c \]

we could have

\[ c(j+1)\ll\delta_n \implies b(j) = c(j+1)-1\ll\delta_n, \quad c(j) = 0, \]

see case 4b of section 3. This is a problem because we want $b(\tau(0))$ to be at
least on the order of $\delta_n$. But if the event $\mathcal{Z}_\delta^c$ occurs, then for some
$j\ge\tau(\delta)$ either:

$p_j = \min\{z_{j+1},z^*_{j+1}\}$ and

\[ c(j+1)\ge\delta_n+1 \implies b(j) = c(j+1)-1\ge\delta_n, \quad c(j) = 0, \]

(see case 4b), or $p_j = \max\{z_{j+1},z^*_{j+1}\}$ and

\[ c(j+1)\ge\delta_n+1 \implies b(j) = b(j+1)+c(j+1)-1\ge\delta_n, \quad c(j) = 0, \]

(see case 3c).
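Thus

\[ \mathcal{Z}_\delta^c\subseteq\mathcal{S} \implies \mathcal{S}^c\cap\mathcal{T}_1\cap\mathcal{E}_2 \subseteq
   (\mathcal{S}^c\cap\mathcal{Z}_0\cap\mathcal{E}_2)\cup(\mathcal{Z}_0^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1), \]

which means that

\[ P(\mathcal{S}^c\cap\mathcal{T}_1\cap\mathcal{E}_2\mid\mu) \le P(\mathcal{S}^c\cap\mathcal{Z}_0\cap\mathcal{E}_2\mid\mu)
   + P(\mathcal{Z}_0^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1\mid\mu). \tag{4.10} \]

In the process of proving (4.5), we will show that

\[ P(\mathcal{Z}_0^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1\mid\mu) = O(\beta_n/n), \tag{4.11} \]
and later in this section we will prove that

\[ P(\mathcal{S}^c\cap\mathcal{Z}_0\cap\mathcal{E}_2\mid\mu) = O(n^{-1}). \tag{4.12} \]

The combination of (4.10)-(4.12) implies (4.7). To conclude our remarks on the
events $\mathcal{Z}_\delta, \mathcal{Z}_0$, we note that an examination of their definitions shows that on
$\mathcal{Z}_\delta$ (respectively on $\mathcal{Z}_0$) we cannot have reached $\tau(\delta)$ (resp. $\tau(0)$) by choosing
$p_j\in\{z_{j+1},z^*_{j+1}\}$. Hence for $\tau(\delta)<\mu$ (resp. $\tau(0)<\mu$) we must have reached
these points by choosing $p_j\in C_{j+1}\cup\mathcal{V}_{j+1}$, which in turn implies that the
parameter $c(j)\ge c(j+1)-1$ for $j\ge\tau(\delta)$ (resp. $\tau(0)$). On the other hand, on
the set $\mathcal{Z}_\delta^c$ we have $\tau(\delta) = \tau(0)$.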

In the following proofs, we will occasionally show that $P(\mathcal{B}\mid\mu)\to0$ by first
showing that for some event $\mathcal{A}$ we have $P(\mathcal{A}^c\mid\mu)\to0$, and then showing that

\[ P(\mathcal{B}\mid\mathcal{A}\,;\mu)\to0. \]

Obviously the result above proves that $P(\mathcal{B}\cap\mathcal{A}\mid\mu)\to0$ as $n\to\infty$. A
conditional probability like the one above is only defined as long as $P(\mathcal{A}\mid\mu)>0$, but
of course if $P(\mathcal{A}\mid\mu) = 0$ then because $\mathcal{B}\subseteq\mathcal{A}\cup\mathcal{A}^c$ we must have
$P(\mathcal{B}\mid\mu)\to0$ anyway. Thus whenever we discuss conditional probabilities we will assume
(and not prove) that the event we condition on has positive probability.
Let us begin proving the results we have discussed.

Lemma 4.1 Let $\mathcal{T}_1 = \{\tau(\delta)-\tau(0)\le2\beta_n\}$, and let $\mathcal{Z}_0, \mathcal{Z}_\delta$ be defined as in
(4.9). Then

\[ P(\mathcal{T}_1^c\mid\mu) = O(n^{-1}), \qquad P(\mathcal{Z}_0^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1\mid\mu) = O(\beta_n/n). \]

Proof. We will start with the second of the results above. We will condition on
the value of $\tau(\delta)$, and introduce notation for events conditioned on that value:

\[ P(\mathcal{W}\mid\tau\,;\mu) := P(\mathcal{W}\mid\tau = \tau(\delta)\,;\mu). \]

With $\mathcal{Z}_i$ defined as in (4.9), we observe that $\mathcal{Z}_i\subseteq\mathcal{Z}_{i+1}$. If the set
$\{z_{i+1},z^*_{i+1}\}$ is empty, then the (conditional) probability that
$p_i\in\{z_{i+1},z^*_{i+1}\}$ is 0, and if the set $\{z_{i+1},z^*_{i+1}\}$ is non-empty, then the
(conditional) probability that $p_i\in\{z_{i+1},z^*_{i+1}\}$ is $2/n$. Thus we have

\[ P(\mathcal{Z}_i^c\cap\mathcal{Z}_{i+1}\mid\tau\,;\mu) \le 2/n, \quad (1\le i\le\mu-1). \tag{4.13} \]

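To avoid having to condition also on the value of $\tau(0)$, we introduce $\mathcal{Z}_\varphi$, where
$\varphi = \max\{\tau(\delta)-2\lfloor\beta_n\rfloor, 0\}$, and note that with this definition,
$\mathcal{Z}_\varphi\cap\mathcal{T}_1\subseteq\mathcal{Z}_0\cap\mathcal{T}_1$. Also, a consideration of the definition of
$\mathcal{Z}_i$ shows that on $\mathcal{Z}_\varphi^c\cap\mathcal{Z}_\delta$ we have $\tau(\delta)-\tau(0)\le2\lfloor\beta_n\rfloor$.

From the law of total probability we have

\[ P(\mathcal{Z}_\varphi^c\cap\mathcal{Z}_\delta\mid\mu) = \sum_{\tau=\lfloor\delta_n\rfloor+1}^{\mu}
   P(\mathcal{Z}_\varphi^c\cap\mathcal{Z}_\delta\mid\tau\,;\mu)\,P(\tau = \tau(\delta)\mid\mu). \tag{4.14} \]

Since $\tau-\varphi\le2\beta_n$, we obtain from (4.13) the bound

\[ P(\mathcal{Z}_\varphi^c\cap\mathcal{Z}_\delta\mid\tau\,;\mu) = \sum_{i=\varphi}^{\tau-1} P(\mathcal{Z}_i^c\cap\mathcal{Z}_{i+1}\mid\tau\,;\mu)
   \le 2\beta_n(2/n) = O(\beta_n/n). \tag{4.15} \]
This bound is independent of $\tau$, so (4.15), combined with (4.14), shows that

\[ P(\mathcal{Z}_\varphi^c\cap\mathcal{Z}_\delta\mid\mu) = O(\beta_n/n). \tag{4.16} \]
Because $\mathcal{Z}_\varphi\cap\mathcal{T}_1\subseteq\mathcal{Z}_0\cap\mathcal{T}_1$, we have
$\mathcal{Z}_0^c\cap\mathcal{T}_1\subseteq\mathcal{Z}_\varphi^c\cap\mathcal{T}_1$, so

\[ \mathcal{Z}_0^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1 \subseteq \mathcal{Z}_\varphi^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1
   \subseteq \mathcal{Z}_\delta\cap\mathcal{Z}_\varphi^c, \]

and (4.16) implies that

\[ P(\mathcal{Z}_0^c\cap\mathcal{Z}_\delta\cap\mathcal{T}_1\mid\mu) = O(\beta_n/n). \]

Further, on the event $\mathcal{Z}_\delta^c$ we have $\tau(0) = \tau(\delta)$ and on the event
$\mathcal{Z}_\delta\cap\mathcal{Z}_\varphi^c$ we have $\tau(\delta)-\tau(0)\le2\beta_n$, therefore

\[ \mathcal{Z}_\delta^c\subseteq\mathcal{T}_1, \quad \mathcal{Z}_\delta\cap\mathcal{Z}_\varphi^c\subseteq\mathcal{T}_1
   \implies \mathcal{T}_1^c = \mathcal{T}_1^c\cap\mathcal{Z}_\varphi. \]

Thus

\[ P(\mathcal{T}_1^c\mid\mu) = P(\mathcal{T}_1^c\cap\mathcal{Z}_\varphi\mid\mu). \]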

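Now $\{\tau(\delta)\le2\lfloor\beta_n\rfloor\}\subseteq\mathcal{T}_1$, so when bounding the probability above
we may restrict our attention to $\tau(\delta)>2\lfloor\beta_n\rfloor$. Hence

\[ P(\mathcal{T}_1^c\cap\mathcal{Z}_\varphi\mid\mu) = \sum_{\tau=2\lfloor\beta_n\rfloor+1}^{n-2}
   P(\mathcal{T}_1^c\cap\mathcal{Z}_\varphi\mid\tau\,;\mu)\,P(\tau(\delta) = \tau\mid\mu). \]

To complete the proof of the lemma it is sufficient to show that

\[ P(\mathcal{T}_1^c\cap\mathcal{Z}_\varphi\mid\tau\,;\mu) = O(n^{-1}). \]
Toward this end we define

\[ \varepsilon_n := \frac{1}{\ln n}, \qquad \nu = \nu_n := \lfloor\varepsilon_n\beta_n/\delta_n\rfloor, \qquad
   k = k_n := \lfloor\delta_n/\varepsilon_n\rfloor, \]

observing that

\[ k_n\nu_n\le\beta_n, \qquad k_n\gg\delta_n, \qquad k_n\nu_n^2\gg n\ln n. \]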
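
The three relations can be checked directly from the definitions. The following short verification is added for convenience; it reads $\varepsilon_n$ as $1/\ln n$ and treats the rounding in $\nu_n$ and $k_n$ as negligible, an assumption that affects only lower-order terms.

```latex
\begin{align*}
\frac{\varepsilon_n \beta_n}{\delta_n}
  &= \frac{n^{2/3}\ln^{2} n}{n^{1/3}\ln n} = n^{1/3}\ln n,
  &\frac{\delta_n}{\varepsilon_n} &= n^{1/3}\ln n, \\
k_n \nu_n &\sim n^{2/3}\ln^{2} n = \beta_n,
  &\frac{k_n}{\delta_n} &\sim \ln n \to \infty, \\
\frac{k_n \nu_n^{2}}{n \ln n} &\sim \frac{n \ln^{3} n}{n \ln n} = \ln^{2} n \to \infty.
\end{align*}
```

If both quantities are rounded down, the first relation holds exactly as $k_n\nu_n \le (\delta_n/\varepsilon_n)(\varepsilon_n\beta_n/\delta_n) = \beta_n$.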

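Then we consider the sub-string $(p_{\tau-2k\nu},\dots,p_{\tau-1})$, which can be divided
into $2k$ segments of length $\nu$, leading us to introduce the notation

\[ P(i) := (p_{m(i)},\dots,p_{m(i-1)-1}), \qquad m(i) := \tau-i\nu, \quad (1\le i\le2k), \]

and

\[ \mathcal{D}_i := \{p_j\in\mathcal{V}_{j+1} \text{ for at least one } p_j\in P(i)\}. \]

The event $\mathcal{T}_1^c$ is the event that in steps $\tau-1$ through $\tau-2\nu k$ we add fewer than
$\delta_n$ elements of $C_{\tau(\delta)}$ as vertices of the pair of trees we are building. Because
every choice of a $p_j\in\mathcal{V}_{j+1}$ forces us to add a vertex from $C_{j+1}$, and because
$k\gg\delta_n$, we have

\[ \mathcal{T}_1^c\cap\mathcal{Z}_\varphi \subseteq \bigcup_{i=k+1}^{2k}\mathcal{D}_i^c. \]

So let us bound from above $P(\mathcal{D}_i^c\mid\mathcal{Z}_\varphi\,;\tau\,;\mu)$.

On the event $\mathcal{Z}_\varphi$, we have $|\mathcal{V}_{j+1}| = n-(j+1)-1$ for $\tau-2k\nu\le j\le\tau-1$.
Thus

\[ P(p_j\notin\mathcal{V}_{j+1}\mid\mathcal{Z}_\varphi\,;\tau\,;\mu) = 1-\frac{n-j-2}{n-2}, \]

and the events $\{p_j\notin\mathcal{V}_{j+1}\}$ are conditionally independent for
$\tau-2k\nu\le j\le\tau-1$. Also for $m(i)\le j\le m(i-1)-1$ we have

\begin{align*}
|\mathcal{V}_{j+1}| = n-j-2 &\ge n-(\tau-(i-1)\nu-1)-2 \\
 &\ge n-(n-2-(i-1)\nu-1)-2 \\
 &\ge (i-1)\nu.
\end{align*}

Thus we obtain the bound

\begin{align*}
P(\mathcal{D}_i^c\mid\mathcal{Z}_\varphi\,;\tau\,;\mu)
 &= \prod_{j=m(i)}^{m(i-1)-1} P(p_j\notin\mathcal{V}_{j+1}\mid\mathcal{Z}_\varphi\,;\tau\,;\mu)
  = \prod_{j=m(i)}^{m(i-1)-1}\Bigl(1-\frac{n-j-2}{n-2}\Bigr) \\
 &\le \Bigl(1-\frac{(i-1)\nu}{n-2}\Bigr)^{\nu} \le e^{-(i-1)\nu^2/(n-2)}. \tag{4.17}
\end{align*}

Hence

\[ P(\mathcal{T}_1^c\mid\mathcal{Z}_\varphi\,;\tau\,;\mu) \le \sum_{i=k+1}^{2k} P(\mathcal{D}_i^c\mid\mathcal{Z}_\varphi\,;\tau\,;\mu)
   \le k\,e^{-k\nu^2/(n-2)} = O(n^{-1}), \]

and we find that

\[ P(\mathcal{T}_1^c\cap\mathcal{Z}_\varphi\mid\tau\,;\mu) \le P(\mathcal{T}_1^c\mid\mathcal{Z}_\varphi\,;\tau\,;\mu) = O(n^{-1}). \]

□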



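Lemma 4.2 Let $\mathcal{T}_2 = \{\tau(0)\le n-\beta_n\}$ and let $\mathcal{Z}_0, \mathcal{Z}_\delta$ be defined as in (4.9).
Then

\[ P(\mathcal{T}_2^c\mid\mu) = O(\beta_n/n). \]

Proof. Recall that by definition, $\tau(0)\le\mu$, so the probability above is zero if
$\mu\le n-\beta_n$, and we may assume that $\mu>n-\beta_n$. Now let us consider the set
$\mathcal{Z}_\rho$, where $\rho = \mu-\lfloor\beta_n\rfloor-1$, and observe that on this event
$c(j)\ge c(j+1)-1$ for $\rho\le j<\mu$. Thus

\[ \mathcal{T}_2^c \subseteq \mathcal{Z}_\rho^c\cup\{c(\mu)\le\lfloor\beta_n\rfloor+1\}, \]

and

\[ P(\mathcal{T}_2^c\mid\mu) \le P(\mathcal{Z}_\rho^c\mid\mu) + P(\{c(\mu)\le\lfloor\beta_n\rfloor+1\}\mid\mu). \]
We first observe that, by an argument similar to that in (4.15), we have

\[ P(\mathcal{Z}_\rho^c\mid\mu) = O(\beta_n/n). \]

Then we note that

\[ \{c(\mu)\le\lfloor\beta_n\rfloor+1\} \subseteq \mathcal{U}_1\cup\mathcal{U}_2, \]

where

\[ \mathcal{U}_1 := \{\max\{p_\mu,p_\mu^*\}\in V_{\mu+1}\}, \]

\[ \mathcal{U}_2 := \{\max\{p_\mu,p_\mu^*\} \text{ is one of the } \lfloor\beta_n\rfloor+2 \text{ largest elements of } X_{\mu+1}\}. \]
So we have

\[ P(\{c(\mu)\le\lfloor\beta_n\rfloor+1\}\mid\mu) \le P(\mathcal{U}_1\mid\mu) + P(\mathcal{U}_2\mid\mu). \]
Now for $\mu>n-\beta_n$,

\[ P(\mathcal{U}_1\mid\mu) \le \frac{n-\mu-1}{n} + \frac{n-\mu-1}{n-1} = O(\beta_n/n), \]

and

\[ P(\mathcal{U}_2\mid\mu) \le \frac{\lfloor\beta_n\rfloor+2}{n} + \frac{\lfloor\beta_n\rfloor+2}{n-1} = O(\beta_n/n). \]

Thus

\[ P(\{c(\mu)\le\lfloor\beta_n\rfloor+1\}\mid\mu) = O(\beta_n/n). \]

□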



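Lemma 4.3 Let $\mathcal{S} = \{b(\tau(0))\ge2^{-12}\delta_n\}$, and let $\mathcal{Z}_0$ be defined as in (4.9).
Then

\[ P(\mathcal{S}^c\cap\mathcal{Z}_0\cap\mathcal{E}_2\mid\mu) = O(n^{-1}). \]

Proof. Consider the event $\mathcal{Z}_0\cap\{\tau(0) = \tau\}$. On this event, if $\tau\le j$ then the
only way we can have $b(j)<b(j+1)$ is if we choose $p_j\in B_{j+1}$, see section 3
case 1. On the event $\mathcal{E}_2\cap\mathcal{S}^c$ we have $b(\mu)\ge\delta_n$ but $b(\tau(0))<2^{-12}\delta_n$. Thus on
the event $\mathcal{E}_2\cap\mathcal{S}^c\cap\mathcal{Z}_0$ we must have chosen $p_j\in B_{j+1}$ more than
$(1-2^{-12})b(\mu)$ times over the range of indices $1\le j\le\mu-1$. We will show that this is
unlikely to occur.

Toward this end, we will divide the substring $(p_1,\dots,p_{\mu-1})$ into segments
again, this time letting $k(i) = \max\{0,\mu-in/12\}$, and for $i\ge1$, we let

\[ \mathcal{U}_i := B_\mu\cap\{x\notin\{p_{k(i)},\dots,p_{\mu-1}\}\}, \qquad u_i := |\mathcal{U}_i|, \qquad (\mathcal{U}_0 := B_\mu). \]

So $\mathcal{U}_i$ (which depends on $p_{k(i)},\dots,p_{n-2},p_\mu^*$) is the set of elements of $B_\mu$ which
have not been chosen as a $p_j$ for $j\ge k(i)$. We will show that with high probability
$u_{i+1}\ge u_i/2$ for $0\le i\le11$, because if this happens for each such $i$ then we
must have $u_{12}\ge2^{-12}u_0$. On the event $\mathcal{E}_2$, this implies the event $\mathcal{S}$.
Thus we have

\[ \mathcal{E}_2\cap\mathcal{S}^c\cap\mathcal{Z}_0 \subseteq \mathcal{J}_{12}^c\cap\mathcal{E}_2, \qquad
   \mathcal{J}_i := \bigcap_{l=0}^{i-1}\{u_{l+1}\ge u_l/2\}, \quad (i\ge1). \tag{4.18} \]

As

\begin{align*}
P(\mathcal{J}_{12}^c\cap\mathcal{E}_2\mid\mu)
 &= P(\mathcal{J}_1^c\cap\mathcal{E}_2\mid\mu) + \sum_{i=2}^{12} P(\mathcal{J}_i^c\cap\mathcal{J}_{i-1}\cap\mathcal{E}_2\mid\mu) \\
 &\le P(\mathcal{J}_1^c\mid\mathcal{E}_2\,;\mu) + \sum_{i=2}^{12} P(\mathcal{J}_i^c\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;\mu),
\end{align*}
it is enough to show that

\[ P(\mathcal{J}_1^c\mid\mathcal{E}_2\,;\mu),\ P(\mathcal{J}_i^c\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;\mu) = O(n^{-1}), \quad (2\le i\le12). \tag{4.19} \]

We will prove the result above for $P(\mathcal{J}_i^c\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;\mu)$; the proof for
$P(\mathcal{J}_1^c\mid\mathcal{E}_2\,;\mu)$ is similar. Denote by $P(|\mathcal{U}_i| = u\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;\mu)$ the
conditional probability that at the end of step $k(i)$, the set $\mathcal{U}_i$ is a specific set of
cardinality $u$ ($\mathcal{U}_i = \{w_1,\dots,w_u\}$), and by
$P(u_{i+1}<u/2\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;|\mathcal{U}_i| = u\,;\mu)$ the conditional
probability that $u_{i+1}<u/2$ given the fixed set $\mathcal{U}_i$ (and given $\mathcal{J}_{i-1}\cap\mathcal{E}_2$, $\mu$).
Then

\[ P(\mathcal{J}_i^c\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;\mu) = \sum_{u=\lceil2^{-i}\delta_n\rceil}^{n}\ \sum_{|\mathcal{U}_i|=u}
   P(u_{i+1}<u/2\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;|\mathcal{U}_i| = u\,;\mu)
   \cdot P(|\mathcal{U}_i| = u\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;\mu), \]

where the outer sum above is over the cardinality of $\mathcal{U}_i$ and the inner sum is
over all subsets of $[n]$ of that cardinality. The outer sum starts at $u = \lceil2^{-i}\delta_n\rceil$
because conditioned on $\mathcal{J}_{i-1}\cap\mathcal{E}_2$, we must have

\[ u \ge 2^{-i}b(\mu) \ge 2^{-i}\delta_n. \]

So we can prove (4.19) by showing that

\[ P(u_{i+1}<u/2\mid\mathcal{J}_{i-1}\cap\mathcal{E}_2\,;|\mathcal{U}_i| = u\,;\mu) = O(n^{-1}), \tag{4.20} \]

where the $O(\cdot)$ bound above is uniform over all sets $\mathcal{U}_i$ of cardinality at least
$2^{-i}\delta_n$.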

The probability in (4.20) is equal to $N(\mathcal{U}_i)/n^{k(i)-k(i+1)}$, where

1. $N(\mathcal{U}_i)$ is the number of P-string segments $(p_{k(i+1)}, \dots, p_{k(i)-1})$ such that we choose at least half of the elements of $\mathcal{U}_i$ as entries $p_j$ of our segment, and

2. $n^{k(i)-k(i+1)}$ is the total number of P-string segments $(p_{k(i+1)}, \dots, p_{k(i)-1})$.

Because we want to count P-string segments, it is important that conditioning on the events $\mathcal{J}_{i-1} \cap \mathcal{E}_2$ and $|\mathcal{U}_i| = u$ requires knowledge of $(p_{k(i)}, \dots, p_{n-2})$, $p_\mu^*$ but not of the value of $p_j$ for $j \le k(i) - 1$, and it is also important that for each $i$, $k(i)$ is a fixed number once we have conditioned on $\mu$. Before we begin counting, let us also introduce the notation

$$(u)_i := u(u-1)\cdots(u-i+1), \qquad d = \lceil u/2 \rceil,$$

and note that for large enough $n$ we have $d \ge 2^{-12}\delta_n$. To find an upper bound for $N(\mathcal{U}_i)$, we

1. choose $d$ out of $k := k(i) - k(i+1) \le n/12$ positions,

2. choose $d$ distinct elements of $\mathcal{U}_i$ for those positions, and

3. then choose any value of $p_j$ for the remaining $k - d$ positions.

Thus (for $k - d \ge 0$) we have

$$\frac{N(\mathcal{U}_i)}{n^k} \le \binom{k}{d} \frac{(u)_d\, n^{k-d}}{n^k} \le \frac{u^d k^d e^d}{d^d n^d} = O(2^d e^d 12^{-d}) = O(2^{-\delta_n/2^{12}}) = O(n^{-1}).$$

For $k - d < 0$, $N(\mathcal{U}_i) = 0$. This proves (4.20).

□ 
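The counting step in the proof above rests on the chain $\binom{k}{d}(u)_d/n^d \le (uke/(dn))^d \le (e/6)^d$, valid when $d = \lceil u/2 \rceil \le k \le n/12$. The following Python sketch checks this chain numerically for given parameters. The function name `counting_bound_logs` and any sample values passed to it are illustrative assumptions, not part of the paper; logarithms are used only to avoid overflow.

```python
import math


def counting_bound_logs(n, k, u):
    """Return natural logs of the three quantities in the chain
        C(k, d) * (u)_d / n^d  <=  (u*k*e / (d*n))^d  <=  (e/6)^d,
    where d = ceil(u/2). Hypothetical helper: assumes 1 <= d <= k <= n/12.
    """
    d = math.ceil(u / 2)
    if d < 1 or d > k or 12 * k > n:
        raise ValueError("need 1 <= d <= k <= n/12")
    # Exact integer count C(k, d) * (u)_d, then divide by n^d in log space.
    log_count = (math.log(math.comb(k, d))
                 + math.log(math.perm(u, d))
                 - d * math.log(n))
    # Uses C(k, d) <= (k*e/d)^d and (u)_d <= u^d.
    log_middle = d * (math.log(u) + math.log(k) + 1.0
                      - math.log(d) - math.log(n))
    # Uses u/d <= 2 and k/n <= 1/12, so the base is at most 2e/12 = e/6.
    log_final = d * (1.0 - math.log(6.0))
    return log_count, log_middle, log_final
```

Since $e/6 < 1/2$, the last quantity is at most $2^{-d}$, which is where the $O(2^{-\delta_n/2^{12}})$ term comes from once $d \ge 2^{-12}\delta_n$.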

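In this section we have shown that

$$P(\{\Delta = \ell\} \cap \mathcal{E}_2 \mid \mu) = P(\{\Delta = \ell\} \cap \mathcal{S} \cap \mathcal{T}_2 \mid \mu) + O(\beta_n/n).$$

In the next section we will consider the event $\{\Delta = \ell\} \cap \mathcal{S} \cap \mathcal{T}_2$.

4.2 The event $\{\Delta = \ell\} \cap \mathcal{S} \cap \mathcal{T}_2$

Recall from case 2b of Section 3 that if $b(j) > 0$, $c(j) = 0$ and we choose $p_{j-1} \in \mathcal{V}_j = V_j \cap V_j^*$ then there are three possibilities:

1. the event $\mathcal{H}_{j-1} := \{h_P(z_j) = p_j\}$ occurred,

2. the event $\mathcal{H}^*_{j-1} := \{h_{P^*}(z_j^*) = p_j\}$ occurred, or

3. $\Delta_{j-1} = 1$.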

On the event $\mathcal{S}$, we have $b(\tau(0)) \ge 2^{-12}\delta_n$ which implies that $\tau(0) \ge 2^{-12}\delta_n + 1$ (see the discussion following (4.3)). Thus at step $\tau(0)$ we have at least $2^{-12}\delta_n$ values of $p_j$ ($j < \tau(0)$) left to choose, and we will show that it is likely that we will have $p_j \in \mathcal{V}_{j+1}$ at least $\ell + 1$ times, and it is unlikely that $\mathcal{H}^*_j$, $\mathcal{H}_j$ will occur for these $p_j$. In this fashion we will show that

$$P(\{\Delta = \ell\} \cap \mathcal{S} \cap \mathcal{T}_2 \mid \mu) = O(\delta_n/n). \tag{4.21}$$

To be more specific, we will let

$$\nu = \nu_n := \lfloor 2^{-12}\delta_n/k \rfloor, \qquad k := \ell + 1,$$

and we will condition on the value of $\tau(0)$ ($\tau(0) = \tau$), dividing the substring $(p_{\tau-k\nu}, \dots, p_{\tau-1})$ into $k$ segments of length $\nu$, as we have done before. We will find that this time we need to leave the first element of each segment as a buffer between adjacent segments, so we use the notation

$$P^-(i) := (p_{m(i)}, \dots, p_{m(i-1)-2}), \qquad m(i) := \tau - i\nu, \quad (1 \le i \le k),$$

to denote the last $\nu - 1$ elements of the $i$th segment. On the event $\mathcal{T}_2 = \{\tau(0) < n - \beta_n\}$ we have

$$|\mathcal{V}_{j+1}| \ge n - j - 2 \ge \beta_n - 1, \quad (1 \le j \le \tau - 1).$$
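The index bookkeeping for the segments $P^-(i)$ and their buffer elements can be sketched as follows. This is a minimal illustration under the definitions above ($m(i) = \tau - i\nu$, body indices $m(i), \dots, m(i-1)-2$, buffer index $m(i-1)-1$); the function name `segment_layout` and its argument checks are assumptions made for the example.

```python
def segment_layout(tau, nu, k):
    """List the index layout of the k segments of length nu below tau.

    Returns a list of tuples (i, buffer_index, body_indices), where
    body_indices are the indices m(i), ..., m(i-1)-2 of P^-(i) and
    buffer_index is m(i-1)-1. Hypothetical helper, not from the paper.
    """
    if nu < 2 or k < 1 or tau - k * nu < 1:
        raise ValueError("need nu >= 2, k >= 1 and tau - k*nu >= 1")

    def m(i):
        return tau - i * nu

    layout = []
    for i in range(1, k + 1):
        # range(a, b) stops at b - 1, so this covers m(i) .. m(i-1) - 2.
        body = list(range(m(i), m(i - 1) - 1))
        buffer_index = m(i - 1) - 1
        layout.append((i, buffer_index, body))
    return layout
```

Each body has $\nu - 1$ indices, and consecutive bodies are separated by exactly one buffer index, so together they tile $\{\tau - k\nu, \dots, \tau - 1\}$.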

Introducing the event

$$\mathcal{Z}_* := \{p_j \notin \{z_{j+1}, z^*_{j+1}\} \text{ for } \tau - k\nu \le j < \tau\}, \tag{4.22}$$

we note that

$$P(p_j \in \mathcal{V}_{j+1} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = \frac{n-j-2}{n-2},$$

and that the events $p_j \in \mathcal{V}_{j+1}$ are conditionally independent for $\tau - k\nu \le j < \tau$. We will show that the event $\mathcal{Z}_*^c$ is unlikely to occur conditioned on $\mathcal{T}_2 \cap \mathcal{S}$. Then we will find that, conditioned on $\mathcal{Z}_*$, it is likely that the event

$$\mathcal{C} := \{\text{we choose at least one } p_j \in \mathcal{V}_{j+1} \text{ in each segment } P^-(i)\}$$

occurs. At the same time we will prove a result which involves the buffer elements, i.e. for $\rho(i) := m(i-1) - 2$ ($1 \le i \le k$) it is unlikely that the event $\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)}$ will occur. With all these results established, we will then be able to prove (4.21).

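Lemma 4.4 Conditioned on $\tau(0) = \tau$, let $\mathcal{Z}_*$ be defined as in (4.22). Then

$$P(\mathcal{Z}_*^c \mid \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = O(\delta_n/n).$$

Proof. Let us begin by defining

$$\mathcal{Z}_*(i) := \{p_j \notin \{z_{j+1}, z^*_{j+1}\} \text{ for } \tau - i \le j < \tau\}, \qquad \mathcal{Z}_*(0) := \{\{v_\tau, v_\tau^*\} \ne \emptyset\}.$$

By the same argument as in (4.13), we have

$$P(\mathcal{Z}_*(i)^c \cap \mathcal{Z}_*(i-1) \mid \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le 2/n, \quad (1 \le i \le k\nu).$$

So

$$P(\mathcal{Z}_*^c \mid \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = \sum_{i=1}^{k\nu} P(\mathcal{Z}_*(i)^c \cap \mathcal{Z}_*(i-1) \mid \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le 2k\nu/n = O(\delta_n/n).$$

□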

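Next, let

$$\mathcal{C} := \bigcap_{i=1}^{k} \mathcal{C}_i, \qquad \mathcal{C}_i := \{p_j \in \mathcal{V}_{j+1} \text{ for at least one } p_j \in P^-(i)\},$$

and define

$$H^{(\rho)} = H^{(\rho)}(P, P^*) := \sum_{i=1}^{k} I_{\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)}}, \qquad \rho(i) := m(i-1) - 2,$$

where $I_A$ denotes the indicator of the event $A$. So $H^{(\rho)}$ counts the number of $i$ for which $\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)}$ occurs.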

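Lemma 4.5 Let $\mathcal{C}$, $\mathcal{C}_i$ and $H^{(\rho)}$ be defined as above. Then

$$P(\mathcal{C}^c \cup \{H^{(\rho)} > 0\} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = O(n^{-1}).$$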

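Proof. If we condition on $\mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S}$, then for $\tau - k\nu \le j < \tau$ we have

$$|\mathcal{V}_{j+1}| = n - j - 2, \qquad |\{z_{j+1}, z^*_{j+1}\}| = 2,$$

and the events $p_j \in \mathcal{V}_{j+1}$ are conditionally independent, with

$$P(p_j \in \mathcal{V}_{j+1} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = \frac{n-j-2}{n-2} \ge \frac{\beta_n - 1}{n-2}.$$

Thus, as in (4.17), we obtain

$$P(\mathcal{C}_i^c \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = \prod_{j=m(i)}^{m(i-1)-2} \left(1 - \frac{n-j-2}{n-2}\right).$$

Since $\mathcal{C}^c = \bigcup_{i=1}^{k} \mathcal{C}_i^c$, we use a union bound to obtain

$$P(\mathcal{C}^c \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le \sum_{i=1}^{k} P(\mathcal{C}_i^c \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = O(n^{-1}).$$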

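Next we consider $H^{(\rho)}$. Conditioned on the event $\mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S}$, we have $|\{z_{j+1}, z^*_{j+1}\}| = 2$ (for $\tau - k\nu \le j < \tau$), so

$$P(\mathcal{H}_{\rho(i)} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) = \begin{cases} 1/(n-2), & h_P(z_{\rho(i)+1}) \notin \{z_{\rho(i)+2}, z^*_{\rho(i)+2}\}, \\ 0, & \text{otherwise.} \end{cases}$$

So we have

$$P(\mathcal{H}_{\rho(i)} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu), \ P(\mathcal{H}^*_{\rho(i)} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le \frac{1}{n-2},$$

and a union bound gives us

$$P(\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le \frac{2}{n-2}.$$

Hence

$$P(\{H^{(\rho)} > 0\} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le E[H^{(\rho)} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu] = \sum_{i=1}^{k} P(\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)} \mid \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} ; \tau ; \mu) \le \frac{2k}{n-2}. \tag{4.23}$$

□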



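In order to complete our proof, we introduce the notation

$$\mathcal{G} = \mathcal{C} \cap \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} \cap \{H^{(\rho)} = 0\}$$

and observe that

$$\{\Delta = \ell\} \cap \mathcal{S} \cap \mathcal{T}_2 \subseteq \mathcal{Z}_*^c \cup \left( (\mathcal{C}^c \cup \{H^{(\rho)} > 0\}) \cap \mathcal{Z}_* \cap \mathcal{T}_2 \cap \mathcal{S} \right) \cup \left( \{\Delta = \ell\} \cap \mathcal{G} \right).$$

Lemmas 4.4 and 4.5 imply

$$P(\{\Delta = \ell\} \cap \mathcal{S} \cap \mathcal{T}_2 \mid \tau ; \mu) = P(\{\Delta = \ell\} \cap \mathcal{G} \mid \tau ; \mu) + O(\delta_n/n), \tag{4.24}$$

so it remains only to prove the following lemma.

Lemma 4.6

$$P(\{\Delta = \ell\} \cap \mathcal{G} \mid \tau ; \mu) = 0.$$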

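Proof. On the event $\mathcal{G}$, we will choose at least one $p_j \in \mathcal{V}_{j+1}$ from each segment $P^-(i)$. Thus we can consider the (random) subset of indices

$$\Gamma = \{\gamma(1) < \cdots < \gamma(k)\}, \tag{4.25}$$

for which $\gamma(i)$ is the largest element of $\{m(i), \dots, \rho(i)\}$ such that $p_{\gamma(i)} \in \mathcal{V}_{\gamma(i)+1}$. This makes $p_{\gamma(i)}$ the last entry of the segment such that $p_j \in \mathcal{V}_{j+1}$. We also define

$$H^{(\gamma)} = H^{(\gamma)}(P, P^*) := \sum_{i=1}^{k} I_{\mathcal{H}_{\gamma(i)} \cup \mathcal{H}^*_{\gamma(i)}}.$$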

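From the discussion at the beginning of this section, we can see that

$$\mathcal{G} \cap \{H^{(\gamma)} = 0\} \subseteq \{\Delta = \ell\}^c,$$

which means that

$$\{\Delta = \ell\} \cap \mathcal{G} \subseteq \{H^{(\gamma)} > 0\} \cap \mathcal{G} \subseteq \{H^{(\gamma)} > 0\} \cap \{H^{(\rho)} = 0\}.$$

To prove this lemma, it is enough to show that

$$\{H^{(\gamma)} > 0\} \cap \{H^{(\rho)} = 0\} = \emptyset, \tag{4.26}$$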
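which we can accomplish by proving that

$$H^{(\gamma)}(P, P^*) \le H^{(\rho)}(P, P^*) \tag{4.27}$$

for all $(P, P^*) \in \mathcal{G}$. We begin by noting that, conditioned on $\mathcal{Z}_*$,

$$p_j \in \mathcal{V}_{j+1} = A_{j+1} \cup B_{j+1} \cup \{z_{j+1}, z^*_{j+1}\} \implies p_j \in A_{j+1} \cup B_{j+1}$$

for $\tau - k\nu \le j < \tau$. Thus if $\gamma(i) = j < \rho(i)$, then

$$p_{j+1} \in A_{j+2} \cup B_{j+2}.$$

Now, recall that the elements of $A_{j+2} \cup B_{j+2}$ have not appeared as any entry $p_l$ ($l \ge j + 2$), but both $h_P(z_{j+1})$, $h_{P^*}(z^*_{j+1})$ have appeared as some $p_l$ ($l \ge j + 2$). Thus

$$h_P(z_{j+1}), \ h_{P^*}(z^*_{j+1}) \notin A_{j+2} \cup B_{j+2} \implies p_{j+1} \ne h_P(z_{j+1}), \ h_{P^*}(z^*_{j+1}).$$

Consequently,

$$\{\gamma(i) < \rho(i)\} \subseteq \left(\mathcal{H}_{\gamma(i)} \cup \mathcal{H}^*_{\gamma(i)}\right)^c,$$

which means that

$$\mathcal{H}_{\gamma(i)} \cup \mathcal{H}^*_{\gamma(i)} \subseteq \left(\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)}\right) \cap \{\gamma(i) = \rho(i)\}.$$

So for every $i$ ($1 \le i \le k$),

$$I_{\mathcal{H}_{\gamma(i)} \cup \mathcal{H}^*_{\gamma(i)}} \le I_{\mathcal{H}_{\rho(i)} \cup \mathcal{H}^*_{\rho(i)}},$$

which proves (4.27). □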



5 Conclusion 

In [5], Paulden and Smith conjectured that $P(\Delta = \ell > 1 \mid \mu)$ was on the order of $n^{-1}$ (conjecture 3 on page 16). We agree with this conjecture, even though we have only proved that $P(\Delta = \ell > 1 \mid \mu)$ is on the order of $n^{-1/3+o(1)}$. Our bound implies that



hm PfA< n > >n 1 / 3 -°( 1 ) 



Thus, for large n, we should expect that a mutation in a P-string changes the 
structure of the tree by either one edge or by many edges, with little likelihood 
of anything in between occurring. 
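For readers who want to experiment with this behaviour, the following Python sketch decodes a P-string with the standard lowest-leaf rule, measures $\Delta(T, T^*) = n - 1 - |E(T) \cap E(T^*)|$, and estimates the frequency of mutations with $\Delta = 1$ by simulation. The function names, the 0-based position index, and the choice to draw the replacement value uniformly from the other $n - 1$ labels are assumptions made for illustration; this is not the paper's analysis, only a small empirical companion to it.

```python
import heapq
import random


def prufer_decode(p):
    """Decode a P-string (entries in 1..n, length n-2) into a set of edges."""
    n = len(p) + 2
    degree = [1] * (n + 1)          # index 0 is unused
    for x in p:
        degree[x] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = set()
    for x in p:
        leaf = heapq.heappop(leaves)        # lowest-numbered current leaf
        edges.add(frozenset((leaf, x)))
        degree[x] -= 1
        if degree[x] == 1:                  # x has just become a leaf
            heapq.heappush(leaves, x)
    # Exactly two vertices remain; they form the final edge.
    u, v = heapq.heappop(leaves), heapq.heappop(leaves)
    edges.add(frozenset((u, v)))
    return edges


def tree_distance(p, q):
    """Delta(T, T*) = n - 1 - |E(T) & E(T*)| for two P-strings of equal length."""
    n = len(p) + 2
    return n - 1 - len(prufer_decode(p) & prufer_decode(q))


def estimate_perfect_rate(n, trials, seed=0):
    """Monte Carlo estimate of the fraction of single-entry mutations with Delta = 1.

    Assumption: the string, the mutated position and the new value are all
    drawn uniformly at random (the new value differs from the old one).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        p = [rng.randint(1, n) for _ in range(n - 2)]
        mu = rng.randrange(n - 2)           # 0-based position of the mutation
        q = list(p)
        q[mu] = rng.choice([v for v in range(1, n + 1) if v != p[mu]])
        hits += tree_distance(p, q) == 1
    return hits / trials
```

Tabulating `tree_distance` over many random mutations, rather than only counting the cases with $\Delta = 1$, gives an empirical view of how the distribution of $\Delta$ splits between a single edge and many edges.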